Data sanitization in association rule mining based on impact factor

Authors

A. Telikani

A. Shahbahrami

R. Tavoli

Abstract

Data sanitization is a process used to promote the sharing of transactional databases among organizations and businesses; it alleviates the concerns of individuals and organizations regarding the disclosure of sensitive patterns. It transforms the source database into a released database so that counterparts cannot discover the sensitive patterns, and data confidentiality is thus preserved against association rule mining. The process relies strongly on minimizing the impact of sanitization on data utility, that is, on minimizing the number of lost patterns: non-sensitive patterns that can no longer be mined from the sanitized database. This study proposes a data sanitization algorithm that hides sensitive patterns, in the form of frequent itemsets, from the database while controlling the impact of sanitization on data utility by estimating the impact factor of each modification on non-sensitive itemsets. The proposed algorithm is compared with the sliding window size algorithm (SWA) and Max-Min1 in terms of execution time, data utility, and data accuracy. Data accuracy is defined as the ratio of deleted items to the total support values of sensitive itemsets in the source dataset. Experimental results demonstrate that the proposed algorithm outperforms SWA and Max-Min1 in terms of data utility and data accuracy, and that it provides better execution time than SWA and Max-Min1 when scaling to large numbers of sensitive itemsets and transactions.
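To make the idea in the abstract concrete, the following is a minimal sketch of impact-aware itemset hiding. It is not the authors' algorithm, whose details are not given here. It assumes a simple greedy strategy: while a sensitive itemset is still frequent, delete the one item from one supporting transaction whose removal touches the fewest non-sensitive itemsets. The function names (`support`, `impact`, `sanitize`), the absolute-count `min_support` threshold, and the tie-breaking order are all assumptions made for illustration.

```python
def support(db, itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def impact(transaction, item, non_sensitive):
    """Hypothetical impact estimate: how many non-sensitive itemsets
    supported by this transaction would lose support if `item` were
    deleted from it."""
    return sum(1 for ns in non_sensitive if item in ns and ns <= transaction)


def sanitize(db, sensitive, non_sensitive, min_support):
    """Greedy sketch (an assumption, not the published method).

    For each sensitive itemset, delete items one at a time until its
    support drops below `min_support` (an absolute count). Each deletion
    picks the (transaction, item) pair with the smallest estimated impact.
    Returns the sanitized copy of the database and the number of deletions.
    """
    db = [set(t) for t in db]  # work on a copy; the source stays intact
    deleted = 0
    for s in sensitive:
        while support(db, s) >= min_support:
            candidates = [
                (impact(t, item, non_sensitive), idx, item)
                for idx, t in enumerate(db) if s <= t
                for item in sorted(s)
            ]
            _, idx, item = min(candidates)  # least impact, then lowest index
            db[idx].discard(item)
            deleted += 1
    return db, deleted


if __name__ == "__main__":
    source = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c"}, {"b", "c"}]
    sensitive = [frozenset({"a", "b"})]
    non_sensitive = [frozenset({"b", "c"}), frozenset({"a"})]
    released, n_deleted = sanitize(source, sensitive, non_sensitive, min_support=2)
    print(released, n_deleted)
```

In this toy run the sensitive itemset `{a, b}` falls below the threshold, while the non-sensitive itemset `{b, c}` keeps its original support because the first deletion is taken from a transaction that does not contain it. A real implementation would need a more careful impact estimate and a more efficient search than this quadratic scan.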

Similar resources

Generalized Association Rule Mining Algorithms Based on Multidimensional Data

This paper proposes a new formal definition of generalized association rules based on multidimensional data. Two algorithms, BorderLHSs and GenerateLHSs-Rule, are designed to generate generalized association rules from multi-level frequent itemsets based on multidimensional data. Experiments show that the algorithms proposed in this paper are more efficient, generate fewer redundant r...

Association Rule Mining on Distributed Data

Applications that process large amounts of data face two major problems as the data grows: first, the large storage required and its management, and second, processing time. Distributed databases largely solve the first problem but aggravate the second. Since the current era is one of networking and communication, and people are interested in keeping large data on networks, researc...

Privacy Preserving Association Rule Mining based on the Intersection Lattice and Impact Factor of Items

Association rules revealed by association rule mining may include sensitive rules, which can pose threats to privacy and protection. A number of researchers in this area have recently made efforts to preserve privacy for sensitive association rules in transactional databases. In this paper, we put forward a heuristic-based association rule hiding algorithm to get rid of t...

Association Rule Mining Based On Trade List

In this paper, a new mining algorithm based on frequent itemsets is defined. The Apriori algorithm scans the database every time it finds frequent itemsets, which is very time-consuming, and it generates candidate itemsets at each step. For large databases, it therefore takes a lot of space to store the candidate itemsets. The undirected itemset graph is an improvement on Apriori, but it takes time and sp...

Journal title:

Journal of AI and Data Mining

Publisher: Shahrood University of Technology

ISSN 2322-5211

Volume 3

Issue 2, 2015

Keywords

data sanitization, association rule hiding, frequent itemsets, association rule mining, privacy preserving data mining
